555 research outputs found

    Content management for inter-organizational projects using e-mail metaphor

    Inter-organizational projects involve the creation, modification and management of content. Unless carefully handled, these overheads can cause the loss of mutual understanding. In this paper, we propose a content management approach for inter-organizational projects that uses the e-mail metaphor; the functions of creating, modifying, and managing content are represented as e-mail messages that are sent to project members automatically. The e-mail metaphor allows us to communicate the creation, modification and management of content explicitly across organizations. It promotes mutual understanding in inter-organizational projects. We have developed a content management system that uses the e-mail metaphor. When one project member adds content to the project, the system notifies the other project members of this event. When the project manager manages the content by grouping, commenting or semantic annotation, the system notifies project members of the operation through the e-mail metaphor, for example by creating a mailbox for a new group.

    Domain-Specific Web Search with Keyword Spices

    Domain-specific web search engines are effective tools for reducing the difficulty in acquiring information from the web. Existing methods for building domain-specific web search engines require human expertise or specific facilities. However, we can build a domain-specific search engine simply by adding domain-specific keywords called "keyword spices" to the user's input query and forwarding it to a general-purpose web search engine. Keyword spices can be effectively discovered from web documents using machine learning technologies. This paper describes domain-specific web search engines that use keyword spices for locating cooking recipes, restaurants, and used cars. To fully automate the construction of domain-specific search engines, we also present trials of using web pages in an existing web directory as training examples.
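The query-augmentation step described in this abstract can be illustrated with a minimal sketch. The function name `add_keyword_spice`, the Boolean `AND`/`OR` query syntax, and the example spice are all assumptions made for illustration; they are not taken from the paper, and the learning step that discovers spices and the call to a real search engine are omitted.

```python
def add_keyword_spice(user_query: str, spice: str) -> str:
    """Combine a user's query with a domain-specific keyword expression.

    Assumption: the target search engine accepts Boolean AND/OR syntax
    with parentheses. Real engines differ, so adapt the joining rule.
    """
    query = user_query.strip()
    if not query:
        raise ValueError("user_query must not be empty")
    # Parenthesize both parts so the operators group as intended.
    return f"({query}) AND ({spice})"


# Hypothetical spice for a cooking-recipe domain (illustrative only,
# not a spice reported by the authors).
RECIPE_SPICE = "ingredients OR tablespoon"

if __name__ == "__main__":
    # The augmented string would then be forwarded to a general-purpose engine.
    print(add_keyword_spice("beef", RECIPE_SPICE))
```

Keeping the spice as a separate string means one general-purpose backend could serve several domains by swapping the expression.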

    Similarity Cluster of Indonesian Ethnic Languages

    Lexicostatistic and language similarity clusters are useful for computational linguistic research that depends on language similarity or cognate recognition. Nevertheless, there are no published lexicostatistic/language similarity clusters of Indonesian ethnic languages available. We formulate an approach to creating language similarity clusters that uses the ASJP database to generate the language similarity matrix, then generates hierarchical clusters with complete linkage and mean linkage clustering, and further extracts two stable clusters with high language similarities. We introduce an extended k-means clustering semi-supervised learning method to evaluate how stably the hierarchical stable clusters remain grouped together despite changes in the number of clusters. The higher the number of trials, the more likely we are to distinctly find the two hierarchical stable clusters in the generated k clusters. However, for all five experiments, the stability level of the two hierarchical stable clusters is highest with 5 clusters. Therefore, we take the 5 clusters as the best clusters of Indonesian ethnic languages. Finally, we plot the generated 5 clusters on a geographical map.
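As a rough illustration of the complete-linkage step mentioned in this abstract, the sketch below clusters a small similarity matrix using only the standard library. The language labels and similarity values are invented toy data, not ASJP figures, and converting similarity to distance as `1 - similarity` is an assumption; the authors' actual pipeline is not reproduced here.

```python
from itertools import combinations


def complete_linkage(similarity, names, n_clusters):
    """Agglomerative clustering with complete linkage.

    similarity[i][j] is assumed to lie in [0, 1]; distance is taken as
    1 - similarity (an assumption of this sketch).
    """
    clusters = [[i] for i in range(len(names))]

    def cluster_distance(a, b):
        # Complete linkage: distance between the two farthest members.
        return max(1.0 - similarity[i][j] for i in a for j in b)

    while len(clusters) > n_clusters:
        # Pick the pair of clusters with the smallest complete-linkage distance.
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: cluster_distance(clusters[p[0]], clusters[p[1]]),
        )
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)

    return [sorted(names[i] for i in c) for c in clusters]


# Toy data with placeholder labels (hypothetical values).
NAMES = ["lang_a", "lang_b", "lang_c", "lang_d"]
SIM = [
    [1.0, 0.9, 0.2, 0.2],
    [0.9, 1.0, 0.2, 0.2],
    [0.2, 0.2, 1.0, 0.8],
    [0.2, 0.2, 0.8, 1.0],
]

if __name__ == "__main__":
    print(sorted(complete_linkage(SIM, NAMES, 2)))
```

Replacing `max` with an average over member pairs in `cluster_distance` would give a mean-linkage variant of the same loop.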

    Plan Optimization to Bilingual Dictionary Induction for Low-Resource Language Families

    Creating a bilingual dictionary is the first crucial step in enriching low-resource languages. Especially for closely related ones, it has been shown that the constraint-based approach is useful for inducing bilingual lexicons from two bilingual dictionaries via the pivot language. However, if no machine-readable dictionaries are available as input, we need to consider manual creation by bilingual native speakers. When the goal is to comprehensively create multiple bilingual dictionaries, even if several machine-readable bilingual dictionaries already exist, it is still difficult to determine the execution order of the constraint-based approach that reduces the total cost. Plan optimization is crucial in composing the order of bilingual dictionary creation with consideration of the methods and their costs. We formalize plan optimization for creating bilingual dictionaries using a Markov Decision Process (MDP), with the goal of getting a more accurate estimate of the most feasible optimal plan with the least total cost before fully implementing the constraint-based bilingual lexicon induction. We model a prior beta distribution of bilingual lexicon induction precision with language similarity and polysemy of the topology as the α and β parameters. It is further used to model the cost function and the state transition probability. We estimated the cost of the all-investment plan as a baseline for evaluating the proposed MDP-based approach, with total cost as the evaluation metric. After using the posterior beta distribution from the first batch of experiments to construct the prior beta distribution for the second batch of experiments, the results show a 61.5% cost reduction compared to the estimated all-investment plan and a 39.4% cost reduction compared to the estimated MDP optimal plan. The MDP-based proposal outperformed the baseline on total cost. Comment: 29 pages, 16 figures, 9 tables, accepted for publication in ACM TALLI
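The idea of carrying a posterior beta distribution from one batch into the next as a prior can be sketched with the standard conjugate update for binary outcomes. The function names, the prior values, and the batch counts below are hypothetical; the paper's own mapping from language similarity and polysemy to α and β is not reproduced.

```python
def beta_posterior(alpha, beta, correct, incorrect):
    """Standard Beta-Binomial conjugate update.

    Observing `correct` successes and `incorrect` failures turns
    Beta(alpha, beta) into Beta(alpha + correct, beta + incorrect).
    """
    return alpha + correct, beta + incorrect


def beta_mean(alpha, beta):
    """Expected value of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


if __name__ == "__main__":
    # Hypothetical prior on induction precision (illustrative values).
    prior = (2.0, 2.0)
    # Hypothetical first batch: 6 correct and 2 incorrect translation pairs.
    posterior = beta_posterior(*prior, correct=6, incorrect=2)
    # The posterior then serves as the prior for the second batch.
    print(posterior, beta_mean(*posterior))
```

A precision estimate such as `beta_mean` could then feed into a cost or transition model, though how the authors do this is not specified in the abstract.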

    Plan Optimization for Creating Bilingual Dictionaries of Low-Resource Languages

    The constraint-based approach has been proven useful for inducing bilingual lexicons for closely related low-resource languages. When we want to create multiple bilingual dictionaries linking several languages, we need to consider manual creation by bilingual language experts if no machine-readable dictionaries are available as input. Plan optimization, which takes the various methods and their costs into consideration, is essential to overcome the difficulty in planning the creation of bilingual dictionaries. We adopt the Markov Decision Process (MDP) in formalizing plan optimization for creating bilingual dictionaries; the goal is to better predict the most feasible optimal plan with the least total cost before fully implementing the constraint-based bilingual dictionary induction framework. We define heuristics based on input language characteristics to devise a baseline plan for evaluating our MDP-based approach, with total cost as the evaluation metric. The MDP-based proposal outperformed heuristic planning on total cost for all datasets examined.

    A Generalized Constraint Approach to Bilingual Dictionary Induction for Low-Resource Language Families

    The lack or absence of parallel and comparable corpora makes bilingual lexicon extraction a difficult task for low-resource languages. The pivot language and cognate recognition approaches have been proven useful for inducing bilingual lexicons for such languages. We propose constraint-based bilingual lexicon induction for closely related languages by extending constraints from the recent pivot-based induction technique and further enabling multiple symmetry assumption cycles to reach many more cognates in the transgraph. We further identify cognate synonyms to obtain many-to-many translation pairs. This article utilizes four datasets: one Austronesian low-resource language and three Indo-European high-resource languages. We use three constraint-based methods from our previous work, the Inverse Consultation method, and translation pairs generated from the Cartesian product of input dictionaries as baselines. We evaluate our results using the metrics of precision, recall, and F-score. Our customizable approach allows the user to conduct cross-validation to predict the optimal hyperparameters (cognate threshold and cognate synonym threshold) with various combinations of heuristics and numbers of symmetry assumption cycles to gain the highest F-score. Our proposed methods show a statistically significant improvement in precision and F-score compared to our previous constraint-based methods. The results show that our method has the potential to complement other bilingual dictionary creation methods, such as word alignment models using parallel corpora for high-resource languages, while handling low-resource languages well.

    Avatar Culture: Cross-Cultural Evaluations of Avatar Facial Expressions


    Ontology extraction from tables on the Web

    Previous work on information extraction from tables makes use of prior knowledge such as a cognition model of tables or lexical knowledge bases for specific domains. However, we often need to interpret table structures in each table differently and to treat lexicons in various domains to more fully utilize the broad range of tables available on the Web. The method proposed in this paper uses relations represented by structures to extract an ontology from a table. Once the interpretations of table structures are given by humans, the table structures are automatically generalized to extract relations from the whole table. We define a formal representation of generalized table structure based on the adjacency of cells and iterative structures. A comparison with a method proposed in previous work shows that our method is well suited to extracting the various relations needed for descriptions in RDF/OWL.